Improving Neural Language Models with a Continuous Cache
نویسندگان
چکیده
We propose an extension to neural network language models to adapt their prediction to the recent history. Our model is a simplified version of memory augmented networks, which stores past hidden activations as memory and accesses them through a dot product with the current hidden activation. This mechanism is very efficient and scales to very large memory sizes. We also draw a link between the use of external memory in neural network and cache models used with count based language models. We demonstrate on several language model datasets that our approach performs significantly better than recent memory augmented networks.
منابع مشابه
Unbounded cache model for online language modeling with open vocabulary
Recently, continuous cache models were proposed as extensions to recurrent neural network language models, to adapt their predictions to local changes in the data distribution. These models only capture the local context, of up to a few thousands tokens. In this paper, we propose an extension of continuous cache models, which can scale to larger contexts. In particular, we use a large scale non...
متن کاملOn Continuous Space Word Representations as Input of LSTM Language Model
Artificial neural networks have become the state-of-the-art in the task of language modelling whereas Long-Short Term Memory (LSTM) networks seem to be an efficient architecture. The continuous skip-gram and the continuous bag of words (CBOW) are algorithms for learning quality distributed vector representations that are able to capture a large number of syntactic and semantic word relationship...
متن کاملLearning to Remember Translation History with a Continuous Cache
Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information. In this work, we propose to augment NMT models with a very light-weight cache-like memory network, which stores recent hidden representations as translation history. The probability distribution over generated words is updated onli...
متن کاملAre Deep Neural Networks the Best Choice[2] for Modeling Source Code?
Current statistical language modeling techniques, including deeplearning based models, have proven to be quite effective for source code. We argue here that the special properties of source code can be exploited for further improvements. In this work, we enhance established language modeling approaches to handle the special challenges of modeling source code, such as: frequent changes, larger, ...
متن کاملAccelerating Convolutional Neural Networks for Continuous Mobile Vision via Cache Reuse
Convolutional Neural Network (CNN) is the state-ofthe-art algorithm of many mobile vision fields. It is also applied in many vision tasks such as face detection and augmented reality on mobile devices. Though benefited from the high accuracy achieved via deep CNN models, nowadays commercial mobile devices are often short in processing capacity and battery to continuously carry out such CNN-driv...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1612.04426 شماره
صفحات -
تاریخ انتشار 2016